Improved Approximation Algorithms for Earth-Mover Distance in Data Streams
Abstract
For two multisets S and T of points in [∆], such that |S| = |T| = n, the earth-mover distance (EMD) between S and T is the minimum cost of a perfect bipartite matching with edges between points in S and T, i.e., EMD(S, T) = min_{π: S→T} Σ_{a∈S} ‖a − π(a)‖₁, where π ranges over all one-to-one mappings. The sketching complexity of approximating the earth-mover distance in the two-dimensional grid is listed as an open problem in [16, 11]. We give two algorithms for computing the EMD between two multisets when the number of distinct points in one set is a small value k = log(∆n). Our first algorithm gives a (1 + ε)-approximation using O(k ε^{-2} log n) space and works only in the insertion-only model. The second algorithm gives an O(min(k, log ∆))-approximation using O(log ∆ · log log ∆ · log n) space in the turnstile model.
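The definition above can be checked on tiny inputs by enumerating every one-to-one mapping directly. The following is a minimal sketch for illustration only, not one of the streaming algorithms from the paper: the function names `l1` and `emd_brute_force` are hypothetical, points are assumed to be tuples of integers (two-dimensional here), and the running time is O(n!).

```python
from itertools import permutations


def l1(a, b):
    """L1 (Manhattan) distance between two points given as tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def emd_brute_force(S, T):
    """Exact EMD by trying every one-to-one mapping from S to T.

    Illustrative only: this enumerates all n! mappings, so it is
    usable just for very small multisets.
    """
    if len(S) != len(T):
        raise ValueError("S and T must have the same size")
    # Each permutation of T pairs S[i] with perm[i], i.e. one mapping pi.
    return min(
        sum(l1(a, b) for a, b in zip(S, perm))
        for perm in permutations(T)
    )


# Hypothetical example multisets; S has only two distinct points.
S = [(0, 0), (0, 0), (3, 1)]
T = [(1, 0), (2, 2), (0, 1)]
print(emd_brute_force(S, T))  # prints 4
```

In this example the optimal mapping sends the two copies of (0, 0) to (1, 0) and (0, 1) at cost 1 each, and (3, 1) to (2, 2) at cost 2.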
Related resources
Sketching Earth-Mover Distance on Graph Metrics
We develop linear sketches for estimating the Earth-Mover distance between two point sets, i.e., the cost of the minimum weight matching between the points according to some metric. While Euclidean distance and Edit distance are natural measures for vectors and strings respectively, Earth-Mover distance is a well-studied measure that is natural in the context of visual or metric data. Our work ...
On constant factor approximation for earth mover distance over doubling metrics
Given a metric space (X, dX), the earth mover distance between two distributions over X is defined as the minimum cost of a bipartite matching between the two distributions. The doubling dimension of a metric (X, dX) is the smallest value α such that every ball in X can be covered by 2^α balls of half the radius. A metric (or a sequence of metrics) is called doubling precisely if its doubling dime...
Compact Data Representations and their Applications
Several algorithmic techniques have been devised recently to deal with large volumes of data. At the heart of many of these techniques are ingenious schemes to represent data compactly. This talk will present some constructions of such compact representation schemes (also referred to as sketches) for estimating distances between sets, vectors, and distributions on an underlying metric (where di...
Space-Efficient Approximation Scheme for Circular Earth Mover Distance
The Earth Mover Distance (EMD) between point sets A and B is the minimum cost of a bipartite matching between A and B. EMD is an important measure for estimating similarities between objects with quantifiable features and has important applications in several areas including computer vision. The streaming complexity of approximating EMD between point sets in a two-dimensional discretized grid i...
Rademacher-Sketch: A Dimensionality-Reducing Embedding for Sum-Product Norms, with an Application to Earth-Mover Distance
Consider a sum-product normed space, i.e. a space of the form Y = ℓ_1^n ⊗ X, where X is another normed space. Each element in Y consists of a length-n vector of elements in X, and the norm of an element in Y is the sum of the norms of its coordinates. In this paper we show a constant-distortion embedding from the normed space ℓ_1^n ⊗ X into a lower-dimensional normed space ℓ_1^{n′} ⊗ X, where n′ ≪ n is ...
Journal: CoRR
Volume: abs/1404.6287
Publication date: 2014